Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection
Authors
Abstract
Monocular 3D object detection aims to localize 3D bounding boxes in a single input 2D image. It is a highly challenging problem and remains open, especially when no extra information (e.g., depth, lidar and/or multi-frames) can be leveraged in training and/or inference. This paper proposes a simple yet effective formulation for monocular 3D object detection without exploiting any extra information. It presents the MonoCon method, which learns Monocular Contexts as auxiliary tasks in training to help monocular 3D object detection. The key idea is that, with the annotated 3D bounding boxes of objects in an image, there is a rich set of well-posed projected 2D supervision signals available in training, such as the projected corner keypoints and their associated offset vectors with respect to the center of the 2D bounding box, which should be exploited as auxiliary tasks in training. The proposed MonoCon is motivated by the Cramer–Wold theorem in measure theory at a high level. In implementation, it utilizes a very simple end-to-end design to justify the effectiveness of learning auxiliary monocular contexts, which consists of three components: a Deep Neural Network (DNN) based feature backbone, a number of regression head branches for learning the essential parameters used in the 3D bounding box prediction, and a number of regression head branches for learning auxiliary contexts. After training, the auxiliary context regression branches are discarded for better inference efficiency. In experiments, the proposed MonoCon is tested on the KITTI benchmark (car, pedestrian and cyclist). It outperforms all prior arts in the leaderboard on the car category and obtains comparable performance on pedestrian and cyclist in terms of accuracy. Thanks to the simple design, the proposed MonoCon method obtains the fastest inference speed, 38.7 fps, in comparisons. Our code is released at https://git.io/MonoCon.
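The projected 2D supervision signals mentioned in the abstract can be illustrated with a small sketch. It is not the MonoCon implementation: the function names, the axis-aligned box convention (yaw is omitted), the pinhole intrinsics `fx, fy, u0, v0`, and the choice to derive the 2D box from the projected corners rather than from an annotation are all assumptions made for illustration.

```python
# Hypothetical sketch of auxiliary 2D targets derived from a 3D box annotation.
# All names and conventions here are illustrative assumptions, not MonoCon code.

def box_corners_3d(center, dims):
    """Return the 8 corners of an axis-aligned 3D box in camera coordinates.

    center: (x, y, z) of the box center, with z pointing forward from the camera.
    dims:   box extents along the x, y, and z axes (rotation omitted for brevity).
    """
    cx, cy, cz = center
    dx, dy, dz = dims
    return [
        (cx + sx * dx / 2, cy + sy * dy / 2, cz + sz * dz / 2)
        for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)
    ]


def project(point, fx, fy, u0, v0):
    """Pinhole projection of a camera-frame 3D point to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + u0, fy * y / z + v0)


def auxiliary_targets(center, dims, fx, fy, u0, v0):
    """Compute projected corner keypoints and their offsets from the 2D box center."""
    keypoints = [project(p, fx, fy, u0, v0) for p in box_corners_3d(center, dims)]
    us = [u for u, _ in keypoints]
    vs = [v for _, v in keypoints]
    # Assumption: the 2D box is taken as the tight box around the projected corners.
    box2d = (min(us), min(vs), max(us), max(vs))
    center2d = ((box2d[0] + box2d[2]) / 2, (box2d[1] + box2d[3]) / 2)
    # Offset vector of each keypoint with respect to the 2D box center.
    offsets = [(u - center2d[0], v - center2d[1]) for u, v in keypoints]
    return keypoints, box2d, center2d, offsets


if __name__ == "__main__":
    # Made-up box 10 m in front of the camera, with made-up intrinsics.
    kps, box, c2d, offs = auxiliary_targets(
        (0.0, 0.0, 10.0), (2.0, 2.0, 2.0), fx=100.0, fy=100.0, u0=0.0, v0=0.0
    )
    print(box, c2d, offs[0])
```

Each such target is fully determined by the 3D annotation and the camera calibration, which is the sense in which these signals are well posed and available for free at training time.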
Similar resources
Monocular Object Detection Using 3D Geometric Primitives
Multiview object detection methods achieve robustness in adverse imaging conditions by exploiting projective consistency across views. In this paper, we present an algorithm that achieves performance comparable to multiview methods from a single camera by employing geometric primitives as proxies for the true 3D shape of objects, such as pedestrians or vehicles. Our key insight is that for a ca...
Monocular Vision-Based Underwater Object Detection
In this paper, we propose an underwater object detection method using monocular vision sensors. In addition to commonly used visual features such as color and intensity, we investigate the potential of underwater object detection using light transmission information. The global contrast of various features is used to initially identify the region of interest (ROI), which is then filtered by the...
Robust 3D Object Tracking from Monocular Images using Stable Parts.
We present an algorithm for estimating the pose of a rigid object in real-time under challenging conditions. Our method effectively handles poorly textured objects in cluttered, changing environments, even when their appearance is corrupted by large occlusions, and it relies on grayscale images to handle metallic environments on which depth cameras would fail. As a result, our method is suitabl...
Monocular Multiview Object Tracking with 3D Aspect Parts
In this work, we focus on the problem of tracking objects under significant viewpoint variations, which poses a big challenge to traditional object tracking methods. We propose a novel method to track an object and estimate its continuous pose and part locations under severe viewpoint change. In order to handle the change in topological appearance introduced by viewpoint transformations, we rep...
Real-time monocular object SLAM
We present a real-time object-based SLAM system that leverages the largest object database to date. Our approach comprises two main components: 1) a monocular SLAM algorithm that exploits object rigidity constraints to improve the map and find its real scale, and 2) a novel object recognition algorithm based on bags of binary words, which provides live detections with a database of 500 3D objec...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2022
ISSN: ['2159-5399', '2374-3468']
DOI: https://doi.org/10.1609/aaai.v36i2.20074